Overview

Dataset statistics

Number of variables: 11
Number of observations: 775
Missing cells: 1742
Missing cells (%): 20.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 66.7 KiB
Average record size in memory: 88.2 B

Variable types

NUM: 10
CAT: 1

Reproduction

Analysis started: 2020-11-13 02:01:18.124939
Analysis finished: 2020-11-13 02:01:46.573587
Duration: 28.45 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

homePrice_index is highly correlated with cpi_rent and 1 other field (High correlation)
cpi_rent is highly correlated with homePrice_index and 1 other field (High correlation)
ppi_resConstruct is highly correlated with cpi_rent and 1 other field (High correlation)
uspop_growth has 60 (7.7%) missing values (Missing)
med_hIncome has 355 (45.8%) missing values (Missing)
homePrice_index has 373 (48.1%) missing values (Missing)
newHouse_starts has 36 (4.6%) missing values (Missing)
ppi_resConstruct has 365 (47.1%) missing values (Missing)
resConstruct_spending has 553 (71.4%) missing values (Missing)
DATE has unique values (Unique)
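
The missing-value counts in the Warnings list can be cross-checked against the "Missing cells" figures in the Dataset statistics. The following is a minimal sketch of that arithmetic in Python. The dictionary and variable names are illustrative and not part of the report, and it assumes that the six warnings cover every variable with missing values.

```python
# Missing-value counts copied from the Warnings list above.
# The names below are illustrative; they are not produced by pandas-profiling.
missing_per_variable = {
    "uspop_growth": 60,
    "med_hIncome": 355,
    "homePrice_index": 373,
    "newHouse_starts": 36,
    "ppi_resConstruct": 365,
    "resConstruct_spending": 553,
}

n_variables = 11      # "Number of variables"
n_observations = 775  # "Number of observations"

# Every (row, column) position counts as one cell.
total_cells = n_variables * n_observations
missing_cells = sum(missing_per_variable.values())
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # 1742
print(f"{missing_pct:.1f}%")  # 20.4%

# Per-variable percentages use the number of observations as the denominator.
for name, count in missing_per_variable.items():
    print(f"{name}: {100 * count / n_observations:.1f}%")
```

Both results agree with the reported 1742 missing cells and 20.4%.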

Variables

DATE
Categorical

UNIQUE

Distinct count: 775
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 6.1 KiB
Value               Count  Frequency (%)
1998-06-01          1      0.1%
2017-06-01          1      0.1%
2017-03-01          1      0.1%
2015-02-01          1      0.1%
1993-03-01          1      0.1%
1964-09-01          1      0.1%
1986-01-01          1      0.1%
1984-11-01          1      0.1%
1987-10-01          1      0.1%
2007-03-01          1      0.1%
Other values (765)  765    98.7%

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

uspop_growth
Real number (ℝ≥0)

MISSING

Distinct count: 59
Unique (%): 8.3%
Missing: 60
Missing (%): 7.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0067087432455932
Minimum: 0.473953539373292
Maximum: 1.6577300373895298
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0.4739535394
5-th percentile: 0.6310078932
Q1: 0.893829201
median: 0.9642539171
Q3: 1.16341162
95-th percentile: 1.404081667
Maximum: 1.657730037
Range: 1.183776498
Interquartile range (IQR): 0.2695824189

Descriptive statistics

Standard deviation: 0.2364318167
Coefficient of variation (CV): 0.2348562266
Kurtosis: 0.3601088
Mean: 1.006708743
Median Absolute Deviation (MAD): 0.1364078754
Skewness: 0.2397468682
Sum: 719.7967514
Variance: 0.05590000395
Histogram with fixed size bins (bins=10)
Value               Count  Frequency (%)
0.4739535394        19     2.5%
0.9217131672        12     1.5%
0.9241641571        12     1.5%
0.9897413822        12     1.5%
0.8658173363        12     1.5%
0.9254839689        12     1.5%
1.389046055         12     1.5%
0.9595899228        12     1.5%
1.439164762         12     1.5%
1.250171646         12     1.5%
Other values (49)   588    75.9%
(Missing)           60     7.7%

Value               Count  Frequency (%)
0.4739535394        19     2.5%
0.5223373579        12     1.5%
0.6310078932        12     1.5%
0.6867731556        12     1.5%
0.7166694134        12     1.5%

Value               Count  Frequency (%)
1.657730037         12     1.5%
1.537997358         12     1.5%
1.439164762         12     1.5%
1.389046055         12     1.5%
1.386885692         12     1.5%
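
The quantile and descriptive statistics reported for uspop_growth are linked by a few simple identities: the range is the maximum minus the minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. The sketch below checks them in Python using the rounded values printed in the report. The variable names are illustrative, and it assumes the mean is taken over the 775 - 60 = 715 non-missing observations.

```python
import math

# Rounded values copied from the uspop_growth statistics above.
minimum, maximum = 0.4739535394, 1.657730037
q1, q3 = 0.893829201, 1.16341162
mean, std = 1.006708743, 0.2364318167
total = 719.7967514        # reported "Sum"
non_missing = 775 - 60     # observations minus missing values (assumed denominator)

value_range = maximum - minimum   # reported Range: 1.183776498
iqr = q3 - q1                     # reported IQR: 0.2695824189
cv = std / mean                   # reported CV: 0.2348562266
variance = std ** 2               # reported Variance: 0.05590000395
mean_from_sum = total / non_missing

# The inputs are rounded, so compare with a relative tolerance.
checks = {
    "range": math.isclose(value_range, 1.183776498, rel_tol=1e-6),
    "iqr": math.isclose(iqr, 0.2695824189, rel_tol=1e-6),
    "cv": math.isclose(cv, 0.2348562266, rel_tol=1e-6),
    "variance": math.isclose(variance, 0.05590000395, rel_tol=1e-6),
    "mean": math.isclose(mean_from_sum, mean, rel_tol=1e-6),
}
print(checks)
```

The same identities can be applied to the other numeric variables in this report.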
 

med_hIncome
Real number (ℝ≥0)

MISSING

Distinct count: 35
Unique (%): 8.3%
Missing: 355
Missing (%): 45.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 57691.8
Minimum: 51742.0
Maximum: 63179.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 51742
5-th percentile: 52709
Q1: 55716
median: 57856
Q3: 60038
95-th percentile: 62626
Maximum: 63179
Range: 11437
Interquartile range (IQR): 4322

Descriptive statistics

Standard deviation: 2906.43916
Coefficient of variation (CV): 0.0503787221
Kurtosis: -0.8768863598
Mean: 57691.8
Median Absolute Deviation (MAD): 2140
Skewness: -0.04949544534
Sum: 24230556
Variance: 8447388.59
Histogram with fixed size bins (bins=10)
Value               Count  Frequency (%)
55931               12     1.5%
51742               12     1.5%
55260               12     1.5%
54233               12     1.5%
59286               12     1.5%
61399               12     1.5%
61526               12     1.5%
61779               12     1.5%
60178               12     1.5%
54608               12     1.5%
Other values (25)   300    38.7%
(Missing)           355    45.8%

Value               Count  Frequency (%)
51742               12     1.5%
52709               12     1.5%
53610               12     1.5%
53897               12     1.5%
54233               12     1.5%

Value               Count  Frequency (%)
63179               12     1.5%
62626               12     1.5%
61779               12     1.5%
61526               12     1.5%
61399               12     1.5%

rentl_vacnyRate
Real number (ℝ≥0)

Distinct count: 58
Unique (%): 7.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.33741935483871
Minimum: 5.0
Maximum: 11.1
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 5
5-th percentile: 5.17
Q1: 6
median: 7.4
Q3: 8.2
95-th percentile: 10.1
Maximum: 11.1
Range: 6.1
Interquartile range (IQR): 2.2

Descriptive statistics

Standard deviation: 1.505817675
Coefficient of variation (CV): 0.205224426
Kurtosis: -0.6948940694
Mean: 7.337419355
Median Absolute Deviation (MAD): 1.1
Skewness: 0.3129023474
Sum: 5686.5
Variance: 2.267486872
Histogram with fixed size bins (bins=10)
Value               Count  Frequency (%)
7.7                 30     3.9%
7                   30     3.9%
5.3                 30     3.9%
8.2                 27     3.5%
7.4                 27     3.5%
7.5                 27     3.5%
7.3                 27     3.5%
5.7                 25     3.2%
8                   24     3.1%
5.5                 24     3.1%
Other values (48)   504    65.0%

Value               Count  Frequency (%)
5                   21     2.7%
5.1                 18     2.3%
5.2                 9      1.2%
5.3                 30     3.9%
5.4                 18     2.3%

Value               Count  Frequency (%)
11.1                3      0.4%
10.7                3      0.4%
10.6                9      1.2%
10.4                3      0.4%
10.3                3      0.4%

unemplt_rate
Real number (ℝ≥0)

Distinct count: 74
Unique (%): 9.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.941161290322581
Minimum: 3.4
Maximum: 14.7
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 3.4
5-th percentile: 3.7
Q1: 4.8
median: 5.6
Q3: 7
95-th percentile: 9.33
Maximum: 14.7
Range: 11.3
Interquartile range (IQR): 2.2

Descriptive statistics

Standard deviation: 1.664338279
Coefficient of variation (CV): 0.2801368617
Kurtosis: 1.203190535
Mean: 5.94116129
Median Absolute Deviation (MAD): 1.1
Skewness: 0.9372684244
Sum: 4604.4
Variance: 2.770021905
Histogram with fixed size bins (bins=10)
Value               Count  Frequency (%)
5.4                 32     4.1%
5.7                 31     4.0%
5.6                 31     4.0%
5.9                 26     3.4%
5.5                 24     3.1%
5                   23     3.0%
3.8                 23     3.0%
5.2                 22     2.8%
6                   20     2.6%
4.9                 20     2.6%
Other values (64)   523    67.5%

Value               Count  Frequency (%)
3.4                 9      1.2%
3.5                 12     1.5%
3.6                 5      0.6%
3.7                 14     1.8%
3.8                 23     3.0%

Value               Count  Frequency (%)
14.7                1      0.1%
13.3                1      0.1%
11.1                1      0.1%
10.8                2      0.3%
10.4                3      0.4%

int_rate
Real number (ℝ≥0)

Distinct count: 139
Unique (%): 17.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.579109677419355
Minimum: 0.25
Maximum: 14.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 0.25
5-th percentile: 0.75
Q1: 2.75
median: 4.5
Q3: 6
95-th percentile: 9.683
Maximum: 14
Range: 13.75
Interquartile range (IQR): 3.25

Descriptive statistics

Standard deviation: 2.825039179
Coefficient of variation (CV): 0.6169407108
Kurtosis: 0.9312057581
Mean: 4.579109677
Median Absolute Deviation (MAD): 1.5
Skewness: 0.8509274036
Sum: 3548.81
Variance: 7.980846364
Histogram with fixed size bins (bins=10)
Value               Count  Frequency (%)
3                   77     9.9%
0.75                71     9.2%
6                   47     6.1%
5                   42     5.4%
4.5                 39     5.0%
5.5                 36     4.6%
3.5                 32     4.1%
4                   29     3.7%
5.25                24     3.1%
7                   22     2.8%
Other values (129)  356    45.9%

Value               Count  Frequency (%)
0.25                5      0.6%
0.5                 14     1.8%
0.75                71     9.2%
0.83                1      0.1%
1                   12     1.5%

Value               Count  Frequency (%)
14                  5      0.6%
13.87               1      0.1%
13.03               1      0.1%
13                  6      0.8%
12.94               1      0.1%

cpi_rent
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 684
Unique (%): 88.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 139.36684129032258
Minimum: 35.9
Maximum: 341.95
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 35.9
5-th percentile: 38.07
Q1: 49.8
median: 126.6
Q3: 210.45
95-th percentile: 305.7476
Maximum: 341.95
Range: 306.05
Interquartile range (IQR): 160.65

Descriptive statistics

Standard deviation: 90.4683866
Coefficient of variation (CV): 0.6491385308
Kurtosis: -0.9653871985
Mean: 139.3668413
Median Absolute Deviation (MAD): 78.5
Skewness: 0.5150686991
Sum: 108009.302
Variance: 8184.528975
Histogram with fixed size bins (bins=10)
Value               Count  Frequency (%)
40                  5      0.6%
39.2                5      0.6%
40.3                4      0.5%
37.6                4      0.5%
40.9                4      0.5%
38.7                4      0.5%
41.4                4      0.5%
38.1                4      0.5%
36.7                4      0.5%
40.5                4      0.5%
Other values (674)  733    94.6%

Value               Count  Frequency (%)
35.9                3      0.4%
36                  1      0.1%
36.1                1      0.1%
36.2                1      0.1%
36.4                2      0.3%

Value               Count  Frequency (%)
341.95              1      0.1%
341.294             1      0.1%
340.811             1      0.1%
340.135             1      0.1%
339.519             1      0.1%

homePrice_index
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct count: 399
Unique (%): 99.3%
Missing: 373
Missing (%): 48.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 128.83690298507463
Minimum: 63.755
Maximum: 219.819
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 63.755
5-th percentile: 72.59445
Q1: 80.80025
median: 134.9045
Q3: 167.30875
95-th percentile: 205.0559
Maximum: 219.819
Range: 156.064
Interquartile range (IQR): 86.5085

Descriptive statistics

Standard deviation: 45.68944536
Coefficient of variation (CV): 0.3546301122
Kurtosis: -1.343375444
Mean: 128.836903
Median Absolute Deviation (MAD): 46.4595
Skewness: 0.1845956994
Sum: 51792.435
Variance: 2087.525417
Histogram with fixed size bins (bins=10)
Value               Count  Frequency (%)
166.675             2      0.3%
78.175              2      0.3%
76.599              2      0.3%
127.15              1      0.1%
149.625             1      0.1%
182.722             1      0.1%
90.885              1      0.1%
176.637             1      0.1%
159.38              1      0.1%
140.164             1      0.1%
Other values (389)  389    50.2%
(Missing)           373    48.1%

Value               Count  Frequency (%)
63.755              1      0.1%
64.156              1      0.1%
64.491              1      0.1%
64.994              1      0.1%
65.568              1      0.1%

Value               Count  Frequency (%)
219.819             1      0.1%
218.6               1      0.1%
217.323             1      0.1%
215.16              1      0.1%
213.255             1      0.1%

newHouse_starts
Real number (ℝ≥0)

MISSING

Distinct count: 572
Unique (%): 77.4%
Missing: 36
Missing (%): 4.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 1428.3031123139378
Minimum: 478.0
Maximum: 2494.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 478
5-th percentile: 694.9
Q1: 1190
median: 1452
Q3: 1652.5
95-th percentile: 2072.3
Maximum: 2494
Range: 2016
Interquartile range (IQR): 462.5

Descriptive statistics

Standard deviation: 391.1805807
Coefficient of variation (CV): 0.2738778466
Kurtosis: 0.03770518926
Mean: 1428.303112
Median Absolute Deviation (MAD): 238
Skewness: -0.04626757492
Sum: 1055516
Variance: 153022.2468
Histogram with fixed size bins (bins=10)
Value               Count  Frequency (%)
1524                4      0.5%
1467                4      0.5%
1246                4      0.5%
1698                4      0.5%
1491                4      0.5%
1421                3      0.4%
1614                3      0.4%
1648                3      0.4%
1590                3      0.4%
1046                3      0.4%
Other values (562)  704    90.8%
(Missing)           36     4.6%

Value               Count  Frequency (%)
478                 1      0.1%
490                 1      0.1%
505                 1      0.1%
517                 1      0.1%
534                 1      0.1%

Value               Count  Frequency (%)
2494                1      0.1%
2485                1      0.1%
2481                2      0.3%
2421                1      0.1%
2390                1      0.1%

ppi_resConstruct
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct count: 319
Unique (%): 77.8%
Missing: 365
Missing (%): 47.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 161.32682926829267
Minimum: 99.8
Maximum: 232.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 99.8
5-th percentile: 104.28
Q1: 132.225
median: 143.9
Q3: 204.275
95-th percentile: 225.52
Maximum: 232.5
Range: 132.7
Interquartile range (IQR): 72.05

Descriptive statistics

Standard deviation: 40.02093538
Coefficient of variation (CV): 0.24807365
Kurtosis: -1.32839534
Mean: 161.3268293
Median Absolute Deviation (MAD): 32.25
Skewness: 0.2130596817
Sum: 66144
Variance: 1601.675269
Histogram with fixed size bins (bins=10)
Value               Count  Frequency (%)
142.4               7      0.9%
137.9               5      0.6%
141.3               5      0.6%
141                 5      0.6%
99.9                4      0.5%
112.7               3      0.4%
140.3               3      0.4%
141.1               3      0.4%
184.4               3      0.4%
211                 3      0.4%
Other values (309)  369    47.6%
(Missing)           365    47.1%

Value               Count  Frequency (%)
99.8                1      0.1%
99.9                4      0.5%
100                 2      0.3%
100.1               1      0.1%
100.4               1      0.1%

Value               Count  Frequency (%)
232.5               1      0.1%
232.1               2      0.3%
231.7               1      0.1%
231.3               1      0.1%
231.2               1      0.1%

resConstruct_spending
Real number (ℝ≥0)

MISSING

Distinct count: 222
Unique (%): 100.0%
Missing: 553
Missing (%): 71.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 440093.990990991
Minimum: 244399.0
Maximum: 684482.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 244399
5-th percentile: 252704.75
Q1: 331524.25
median: 444576
Q3: 545365.75
95-th percentile: 622650.65
Maximum: 684482
Range: 440083
Interquartile range (IQR): 213841.5

Descriptive statistics

Standard deviation: 124743.7025
Coefficient of variation (CV): 0.2834478659
Kurtosis: -1.161382851
Mean: 440093.991
Median Absolute Deviation (MAD): 103266.5
Skewness: -0.08536787214
Sum: 97700866
Variance: 1.556099132e+10
Histogram with fixed size bins (bins=10)
Value               Count  Frequency (%)
369315              1      0.1%
609170              1      0.1%
465592              1      0.1%
590189              1      0.1%
257737              1      0.1%
557411              1      0.1%
544096              1      0.1%
402091              1      0.1%
250965              1      0.1%
300200              1      0.1%
Other values (212)  212    27.4%
(Missing)           553    71.4%

Value               Count  Frequency (%)
244399              1      0.1%
245226              1      0.1%
247981              1      0.1%
248808              1      0.1%
248929              1      0.1%

Value               Count  Frequency (%)
684482              1      0.1%
681263              1      0.1%
676135              1      0.1%
674457              1      0.1%
670757              1      0.1%

Interactions

[Figures: 100 interaction plots of the numeric variables]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that, for a linear function, the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
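
The calculation just described can be sketched in plain Python as follows. The function name `pearson_r` and the toy data are illustrative assumptions: they are not taken from the profiled dataset and are not necessarily how pandas-profiling computes the coefficient internally.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance and sample standard deviations (n - 1 denominator).
    # The denominator cancels in the ratio, so the population form gives the same r.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (std_x * std_y)

# Made-up toy data, roughly linear.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

r = pearson_r(x, y)
# Shifting and rescaling x by a positive factor leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], y)

print(r, r_rescaled)
```

The second call illustrates the invariance under changes in location and (positive) scale mentioned above.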

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
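
A minimal Python sketch of this rank-based calculation is given below. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the toy data are illustrative assumptions. Tied values receive the average of their ranks, which is a common convention but is an assumption here rather than something the report states.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Made-up toy data: y is a nonlinear but strictly increasing function of x.
x = [1, 2, 3, 4, 5]
y = [a ** 3 for a in x]

print(spearman_rho(x, y))  # perfectly monotonic, so rho is 1 up to rounding
print(pearson_r(x, y))     # strictly less than 1 because the relation is not linear
```

The comparison of the two printed values shows why ρ is better suited to nonlinear monotonic relationships.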

Kendall's τ

Similar to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
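
The pair-counting definition above can be sketched directly in Python. This is the simple form without any tie correction (often called tau-a); statistical libraries frequently use a tie-adjusted variant, so results can differ when the data contain ties. The function name and the toy data are illustrative assumptions.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction,
        # discordant when they move in opposite directions.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

# Made-up toy data with no ties.
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 5]

tau = kendall_tau_a(x, y)
print(tau)  # 7 concordant and 3 discordant pairs out of 10, so tau = 0.4
```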

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.

Missing values

[Figures: four missing-value plots]

Sample

First rows

     DATE        uspop_growth  med_hIncome  rentl_vacnyRate  unemplt_rate  int_rate  cpi_rent  homePrice_index  newHouse_starts  ppi_resConstruct  resConstruct_spending
0    1956-01-01  NaN           NaN          6.2              4.0           2.50      35.9      NaN              NaN              NaN               NaN
1    1956-02-01  NaN           NaN          6.2              3.9           2.50      35.9      NaN              NaN              NaN               NaN
2    1956-03-01  NaN           NaN          6.2              4.2           2.50      35.9      NaN              NaN              NaN               NaN
3    1956-04-01  NaN           NaN          5.9              4.0           2.65      36.0      NaN              NaN              NaN               NaN
4    1956-05-01  NaN           NaN          5.9              4.3           2.75      36.1      NaN              NaN              NaN               NaN
5    1956-06-01  NaN           NaN          5.9              4.3           2.75      36.2      NaN              NaN              NaN               NaN
6    1956-07-01  NaN           NaN          6.3              4.4           2.75      36.4      NaN              NaN              NaN               NaN
7    1956-08-01  NaN           NaN          6.3              4.1           2.81      36.4      NaN              NaN              NaN               NaN
8    1956-09-01  NaN           NaN          6.3              3.9           3.00      36.5      NaN              NaN              NaN               NaN
9    1956-10-01  NaN           NaN          5.8              3.9           3.00      36.5      NaN              NaN              NaN               NaN

Last rows

     DATE        uspop_growth  med_hIncome  rentl_vacnyRate  unemplt_rate  int_rate  cpi_rent  homePrice_index  newHouse_starts  ppi_resConstruct  resConstruct_spending
765  2019-10-01  0.473954      NaN          6.4              3.6           2.25      334.680   212.165          1340.0           227.5             563877.0
766  2019-11-01  0.473954      NaN          6.4              3.5           2.25      335.819   212.300          1371.0           226.9             574079.0
767  2019-12-01  0.473954      NaN          6.4              3.5           2.25      336.789   212.413          1587.0           226.7             579863.0
768  2020-01-01  0.473954      NaN          6.6              3.6           2.25      337.825   212.470          1617.0           228.0             596728.0
769  2020-02-01  0.473954      NaN          6.6              3.5           2.25      338.616   213.255          1567.0           227.3             600581.0
770  2020-03-01  0.473954      NaN          6.6              4.4           0.25      339.519   215.160          1269.0           224.5             595963.0
771  2020-04-01  0.473954      NaN          5.7              14.7          0.25      340.135   217.323          934.0            215.9             569892.0
772  2020-05-01  0.473954      NaN          5.7              13.3          0.25      340.811   218.600          1038.0           217.3             549977.0
773  2020-06-01  0.473954      NaN          5.7              11.1          0.25      341.294   219.819          1220.0           221.4             542307.0
774  2020-07-01  0.473954      NaN          5.7              10.2          0.25      341.950   NaN              1496.0           225.3             NaN